Fitted Modelling

Terminology

Regression: looks for relationships among variables. It is used to determine how multiple variables are related, or to predict a value.

Correlation coefficient: the covariance of the variables divided by the product of their standard deviations.

Residuals: the distances (differences) between the observed values and the predicted values.

Ordinary least squares (OLS): fits the model by minimising the sum of squared residuals (SSR).
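The terms above can be illustrated with a small numerical sketch. The data here is synthetic and the names (x, y, beta, ssr) are chosen only for illustration; the true line y = 2x + 1 and the noise level are assumptions, not values from these notes.

```python
import numpy as np

# Synthetic data (assumed for illustration): y is roughly 2x + 1 plus noise.
rng = np.random.default_rng(0)
x = rng.uniform(0, 10, size=50)
y = 2.0 * x + 1.0 + rng.normal(0, 1.0, size=50)

# Correlation coefficient: covariance divided by the product of the
# standard deviations (sample versions, ddof=1, used consistently).
cov_xy = np.cov(x, y)[0, 1]
r = cov_xy / (np.std(x, ddof=1) * np.std(y, ddof=1))

# OLS: choose intercept and slope that minimise the sum of squared residuals.
X = np.column_stack([np.ones_like(x), x])      # design matrix [1, x]
beta, *_ = np.linalg.lstsq(X, y, rcond=None)   # beta = [intercept, slope]

# Residuals: observed values minus predicted values.
y_hat = X @ beta
residuals = y - y_hat
ssr = float(np.sum(residuals ** 2))

print("correlation:", r)
print("intercept, slope:", beta)
print("SSR:", ssr)
```

Any other straight line through the same data gives an SSR at least as large as the OLS one, which is what "minimises" means here.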

Polynomial Regression

Bayesian information criterion (BIC): a model selection measure that includes a penalty for using more variables. Other similar measures include the adjusted R².

• A poor fit due to high bias is called under-fitting.
• A poor fit due to high variance is called over-fitting.

To check how well a model generalises, split the available data into two non-overlapping parts, a training set and a test set.

Bias: measures how much the average prediction differs from the desired regression function.

Variance: measures how much the predictions for individual data sets vary around their average.
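A minimal sketch of these ideas, under stated assumptions: the data is synthetic (a sine curve plus noise), the degrees 1, 3 and 10 are arbitrary choices, and the BIC is computed with the common Gaussian-error form n·ln(SSR/n) + k·ln(n), where k is the number of fitted coefficients. None of these specifics come from the notes themselves.

```python
import numpy as np

# Synthetic data (assumed for illustration): a sine curve plus noise.
rng = np.random.default_rng(1)
x = rng.uniform(-1, 1, size=60)
y = np.sin(np.pi * x) + rng.normal(0, 0.2, size=60)

# Split into two non-overlapping parts: 40 training points, 20 test points.
idx = rng.permutation(60)
train, test = idx[:40], idx[40:]

train_mse, test_mse, bic = {}, {}, {}
for degree in (1, 3, 10):
    # Fit a polynomial of this degree by least squares on the training set only.
    coeffs = np.polyfit(x[train], y[train], degree)

    # Mean squared error on the data used for fitting and on held-out data.
    train_mse[degree] = float(np.mean((y[train] - np.polyval(coeffs, x[train])) ** 2))
    test_mse[degree] = float(np.mean((y[test] - np.polyval(coeffs, x[test])) ** 2))

    # BIC (Gaussian-error form, assumed): fit term plus a penalty per coefficient.
    n, k = len(train), degree + 1
    bic[degree] = n * np.log(train_mse[degree]) + k * np.log(n)

for degree in (1, 3, 10):
    print(degree, train_mse[degree], test_mse[degree], bic[degree])
```

The training error can only go down as the degree rises, so it cannot be used on its own to choose the degree. A straight line under-fits this curve (high bias), while a very high degree may follow the noise in the training set (high variance); the test error and the BIC penalty are two ways of guarding against the latter.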